翻訳と辞書
Words near each other
・ Reservoir Fissure
・ Reservoir fluids
・ Reservoir Football Club
・ Reservoir High School
・ Reservoir High School (Victoria)
・ Reservoir Hill, Baltimore
・ Reservoir Hole
・ Reservoir Media Management
・ Reservoir modeling
・ Reservoir of Ullíbarri-Gamboa
・ Reservoir Park (Massachusetts)
・ Reservoir Park (Pennsylvania)
・ Reservoir Pups
・ Reservoir railway station
・ Reservoir Records
Reservoir sampling
・ Reservoir simulation
・ Reservoir Songs
・ Reservoir State Park
・ Reservoir war
・ Reservoir Woods Park
・ Reservoir, California
・ Reservoir, Providence, Rhode Island
・ Reservoir, Victoria
・ Reservoir, Western Australia
・ Resesi
・ Reset
・ Reset (Atari Teenage Riot album)
・ Reset (Canadian band)
・ Reset (computing)


Dictionary Lists
翻訳と辞書 辞書検索 [ 開発暫定版 ]
スポンサード リンク

Reservoir sampling : ウィキペディア英語版
Reservoir sampling

Reservoir sampling is a family of randomized algorithms for randomly choosing a sample of ''k'' items from a list ''S'' containing ''n'' items, where ''n'' is either a very large or unknown number. Typically ''n'' is large enough that the list doesn't fit into main memory.
== Algorithm R ==
The most common example was labelled ''Algorithm R'' by Jeffrey Vitter in his paper on the subject. This simple O(''n'') algorithm as described in the ''Dictionary of Algorithms and Data Structures''〔( Dictionary of Algorithms and Data Structures )〕 consists of the following steps (assuming that the arrays are one-based, and that the number of items to select, ''k'', is smaller than the size of the source array, ''S''):

/
*
S has items to sample, R will contain the result
*/
ReservoirSample(S(), R())
// fill the reservoir array
for i = 1 to k
R() := S()
// replace elements with gradually decreasing probability
for i = k+1 to n
j := random(1, i) // important: inclusive range
if j <= k
R() := S()

The algorithm creates a "reservoir" array of size ''k'' and populates it with the first ''k'' items of ''S''. It then iterates through the remaining elements of ''S'' until ''S'' is exhausted. At the ''i''th element of ''S'', the algorithm generates a random number ''j'' between 1 and ''i''. If ''j'' is less than or equal to ''k'', the ''j''th element of the reservoir array is replaced with the ''i''th element of ''S''. In effect, for all ''i'', the ''i''th element of ''S'' is chosen to be included in the reservoir with probability ''k/i''. Similarly, at each iteration the ''j''th element of the reservoir array is chosen to be replaced with probability ''1/k
* k/i'', which simplifies to ''1/i''. It can be shown that when the algorithm has finished executing, each item in ''S'' has equal probability (i.e. ''k/length(S)'') of being chosen for the reservoir.
To see this, consider the following proof by induction. After the (i-1)th round, let us assume, the probability of a number being in the reservoir array is ''k/(i-1)''. Since the probability of the number being replaced in the ithround is ''1/i'', the probability that it survives the ith round is ''(i-1)/i''. Thus, the probability that a given number is in the reservoir after the ith round is the product of these two probabilities, i.e. the probability of being in the reservoir after the (i-1)th round, and surviving replacement in the ith round. This is ''(k/(i-1))
* ((i-1)/i)=k/i''. Hence, the result holds for ''i'', and is therefore true by induction.

抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)
ウィキペディアで「Reservoir sampling」の詳細全文を読む



スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース

Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.